Cross-Strait Lexical Differences: A Comparative Study based on Chinese Gigaword Corpus

Authors

  • Jia-Fei Hong
  • Chu-Ren Huang
Abstract

Studies of cross-strait lexical differences in the use of Mandarin Chinese reveal that a divergence has become increasingly evident. This divergence is apparent in phonological, semantic, and pragmatic analyses and has become an obstacle to knowledge-sharing and information exchange. Given the wide range of divergences, it seems that Chinese character forms offer the most reliable regular mapping between cross-strait usage contrasts. In this study, we use general cross-strait lexical wordforms to discover cross-strait lexical differences and explore their contrasts and variations. Following Hong and Huang (2006), we discuss words that express the same concept in cross-strait usage, drawing on WordNet, the Chinese Concept Dictionary (CCD), and Chinese Wordnet (CWN). We take all words that appear in CCD and CWN, check their lexical contrasts in the traditional Chinese character data and the simplified Chinese character data of the Gigaword Corpus, explore their occurrences and distributions, and compare and demonstrate them via the Google website.


Similar resources

Using Chinese Gigaword Corpus and Chinese Word Sketch in linguistic Research

We explore the possibility of deeper linguistic research based on corpus and computational linguistic tools in this paper. In particular, we adopt Chinese Word Sketch, the application of Word Sketch Engine to Chinese GigaWord Corpus, for linguistic research. We apply Chinese Sketch Engine results to deeper linguistic account such as selectional restriction and event type selection. The study is...


A Corpus-based Study of Lexical Bundles in Discussion Section of Medical Research Articles

There has been increasing interest in utilizing corpora in linguistic research and pedagogy in recent years. Rhetorical organization of different sections of research articles may appear similar in various disciplines, but close examination may show subtle differences nonetheless. One of the features that has been at the center of attention especially in recent years is the idiomaticity of a di...


Chinese Sketch Engine and the Extraction of Grammatical Collocations

This paper introduces a new technology for collocation extraction in Chinese. Sketch Engine (Kilgarriff et al., 2004) has proven to be a very effective tool for automatic description of lexical information, including collocation extraction, based on large-scale corpus. The original work of Sketch Engine was based on BNC. We extend Sketch Engine to Chinese based on Gigaword corpus from LDC. We d...


Word sketch lexicography: new perspectives on lexicographic studies of Chinese near synonyms

Comparative study of near synonyms is one of the most productive research paradigms in Chinese lexicography. Empirical studies to discriminate near synonyms are either introspection-based or corpus-based. Yet, due to the large quantity of data in a corpus, lexicological studies of Chinese rarely make full use of the corpus data. To solve this problem, Kilgarriff’s Word Sketch Engine is designed...



Journal title:
  • IJCLCLP

Volume 18  Issue 


Publication date 2013